
# Multi-task Visual Understanding

PE Spatial G14 448

The Perception Encoder (PE) is a state-of-the-art encoder for image and video understanding, trained through simple vision-language learning.

Florence 2 Base

Florence-2 is an advanced vision foundation model developed by Microsoft. It uses a prompt-based approach to handle a wide range of vision and vision-language tasks.
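To make the prompt-based approach concrete, here is a minimal sketch of how a task can be selected by a prompt string. The task tokens are a subset of those listed on the public Florence-2 model card; the helper `build_prompt`, the dictionary `TASK_PROMPTS`, and the set `TASKS_WITH_TEXT_INPUT` are hypothetical names introduced for illustration and are not part of any library. The sketch only builds the prompt; loading the model and running inference are left out.

```python
from typing import Optional

# Task tokens as listed on the public Florence-2 model card (subset only).
# The dictionary keys are illustrative names chosen for this sketch.
TASK_PROMPTS = {
    "caption": "<CAPTION>",
    "detailed_caption": "<DETAILED_CAPTION>",
    "object_detection": "<OD>",
    "ocr": "<OCR>",
    "phrase_grounding": "<CAPTION_TO_PHRASE_GROUNDING>",
}

# Tasks that expect extra text appended directly after the task token.
TASKS_WITH_TEXT_INPUT = {"phrase_grounding"}


def build_prompt(task: str, text_input: Optional[str] = None) -> str:
    """Return the prompt string for a task (hypothetical helper)."""
    if task not in TASK_PROMPTS:
        raise ValueError(f"unknown task: {task!r}")
    token = TASK_PROMPTS[task]
    if task in TASKS_WITH_TEXT_INPUT:
        if not text_input:
            raise ValueError(f"task {task!r} needs text_input")
        # The extra text is concatenated to the task token.
        return token + text_input
    return token


if __name__ == "__main__":
    print(build_prompt("caption"))
    print(build_prompt("phrase_grounding", "a green car"))
```

In a real pipeline the resulting string would be passed to the model's processor together with the image, so switching tasks means changing only the prompt, not the model.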




© 2025 AIbase